Automatically Extracting Templates from Examples for NLP Tasks
Abstract
In this paper, we present the approaches our NLP systems use to automatically extract templates for example-based machine translation and pun generation. On average, 73.25% of the translation templates our translation system extracts are correct. The resulting word error rate ranges from a low of 18%, when the test document contains sentence patterns matching the training set, to a high of 85%, when the test document differs from the training corpus. Of the templates our pun generator extracts, 69.2% are usable, and its computer-generated puns received an average user-feedback score of 2.13, compared with 2.7 for human-generated puns.
Similar resources
A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model
Information extraction (IE) is the process of automatically producing a structured representation from unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP), intensified by the growing volume, heterogeneity, and unstructured form of information. One of the core information extraction tasks is relation extraction wh...
Automatic Translation Template Acquisition Based on Bilingual Structure Alignment
Knowledge acquisition is a bottleneck in machine translation and many NLP tasks. A method for automatically acquiring translation templates from bilingual corpora is proposed in this paper. Bilingual sentence pairs are first aligned in syntactic structure by combining a language parsing with a statistical bilingual language model. The alignment results are used to extract translation templates ...
Large-Scale Semi-Supervised Learning for Natural Language Processing
Natural Language Processing (NLP) develops computational approaches to processing language data. Supervised machine learning has become the dominant methodology of modern NLP. The performance of a supervised NLP system crucially depends on the amount of data available for training. In the standard supervised framework, if a sequence of words was not encountered in the training set, the system c...
Entropy Guided Transformation Learning
This work presents Entropy Guided Transformation Learning (ETL), a new machine learning algorithm for classification tasks. It generalizes Transformation Based Learning (TBL) by automatically solving the TBL bottleneck: the construction of good template sets. We also present ETL Committee, an ensemble method that uses ETL as the base learner. The main advantage of ETL is its easy applicability ...
Automatic Extraction of Briefing Templates
One approach to automatic briefing generation from non-textual events is to split the task into two major steps: extracting briefing templates, and learning aggregators that collate information from events and automatically fill in the templates. In this paper, we describe two novel unsupervised approaches for extracting briefing templates from human writte...